stochastic nature
A Framework for Supervised and Unsupervised Segmentation and Classification of Materials Microstructure Images
Zhang, Kungang, Apley, Daniel W., Chen, Wei, Liu, Wing K., Brinson, L. Catherine
Microstructure of materials is often characterized through image analysis to understand processing-structure-properties linkages. We propose a largely automated framework that integrates unsupervised and supervised learning methods to classify micrographs according to microstructure phase/class and, for multiphase microstructures, segment them into different homogeneous regions. With the advance of manufacturing and imaging techniques, the ultra-high resolution of imaging that reveals the complexity of microstructures and the rapidly increasing quantity of images (i.e., micrographs) enable and necessitate a more powerful and automated framework to extract materials characteristics and knowledge. The framework we propose can be used to gradually build a database of microstructure classes relevant to a particular process or group of materials, which can help in analyzing and discovering/identifying new materials. The framework has three steps: (1) segmentation of multiphase micrographs through a recently developed score-based method so that different homogeneous microstructure regions can be identified in an unsupervised manner; (2) identification and classification of homogeneous regions of micrographs through an uncertainty-aware supervised classification network trained using the segmented micrographs from Step 1, with their identified labels verified via the built-in uncertainty quantification and minimal human inspection; (3) supervised segmentation (more powerful than the segmentation in Step 1) of multiphase microstructures through a segmentation network trained with micrographs and the results from Steps 1-2 using a form of data augmentation. This framework can iteratively characterize/segment new homogeneous or multiphase materials while expanding the database to enhance performance. The framework is demonstrated on various sets of materials and texture images.
Simulation-Free Determination of Microstructure Representative Volume Element Size via Fisher Scores
Liu, Wei, Mojumder, Satyajit, Liu, Wing Kam, Chen, Wei, Apley, Daniel W.
A representative volume element (RVE) is a reasonably small unit of microstructure that can be simulated to obtain the same effective properties as the entire microstructure sample. Finite element (FE) simulation of RVEs, as opposed to much larger samples, saves computational expense, especially in multiscale modeling. Therefore, it is desirable to have a framework that determines RVE size prior to FE simulations. Existing methods select the RVE size based on when the FE-simulated properties of samples of increasing size converge with insignificant statistical variations, with the drawback that many samples must be simulated. We propose a simulation-free alternative that determines RVE size based only on a micrograph. The approach utilizes a machine learning model trained to implicitly characterize the stochastic nature of the input micrograph. The underlying rationale is to view RVE size as the smallest moving window size for which the stochastic nature of the microstructure within the window is stationary as the window moves across a large micrograph. For this purpose, we adapt a recently developed Fisher score-based framework for microstructure nonstationarity monitoring. Because the resulting RVE size is based solely on the micrograph and does not involve any FE simulation of specific properties, it constitutes an RVE for any property of interest that solely depends on the microstructure characteristics. Through numerical experiments of simple and complex microstructures, we validate our approach and show that our selected RVE sizes are consistent with when the chosen FE-simulated properties converge.
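The moving-window rationale in this abstract can be illustrated with a minimal sketch. It does not implement the Fisher score-based method. As a stand-in assumption, it monitors the phase volume fraction inside each window and accepts a window size when that statistic varies by no more than a tolerance across window positions. The function names, the non-overlapping window layout, the tolerance, and the toy image are all hypothetical.

```python
from statistics import pstdev


def window_fractions(image, size):
    """Volume fraction of phase 1 in each non-overlapping size x size window."""
    rows, cols = len(image), len(image[0])
    fractions = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            total = sum(image[r + i][c + j]
                        for i in range(size) for j in range(size))
            fractions.append(total / (size * size))
    return fractions


def smallest_stationary_window(image, sizes, tol=0.05):
    """Return the smallest candidate size whose window statistic is nearly
    constant across positions, or None if no candidate qualifies."""
    for size in sorted(sizes):
        fractions = window_fractions(image, size)
        # Spread of the statistic across positions is the stationarity proxy.
        if len(fractions) >= 2 and pstdev(fractions) <= tol:
            return size
    return None


# Toy binary micrograph (assumed data): a checkerboard of single-pixel cells.
image = [[(r + c) % 2 for c in range(8)] for r in range(8)]
rve_size = smallest_stationary_window(image, sizes=[1, 2, 4])
print(rve_size)
```

On this toy checkerboard, a one-pixel window sees either phase 0 or phase 1, while every 2 x 2 window sees the same half-and-half mix, so the proxy settles at size 2. A real micrograph would need a richer statistic than volume fraction, which is the gap the Fisher score approach is described as filling.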
RACCER: Towards Reachable and Certain Counterfactual Explanations for Reinforcement Learning
Gajcin, Jasmina, Dusparic, Ivana
While reinforcement learning (RL) algorithms have been successfully applied to numerous tasks, their reliance on neural networks makes their behavior difficult to understand and trust. Counterfactual explanations are human-friendly explanations that offer users actionable advice on how to alter the model inputs to achieve the desired output from a black-box system. However, current approaches to generating counterfactuals in RL ignore the stochastic and sequential nature of RL tasks and can produce counterfactuals that are difficult to obtain or do not deliver the desired outcome. In this work, we propose RACCER, the first RL-specific approach to generating counterfactual explanations for the behavior of RL agents. We first propose and implement a set of RL-specific counterfactual properties that ensure easily reachable counterfactuals with highly probable desired outcomes. We use a heuristic tree search of the agent's execution trajectories to find the most suitable counterfactuals based on the defined properties. We evaluate RACCER in two tasks as well as conduct a user study to show that RL-specific counterfactuals help users better understand agents' behavior compared to the current state-of-the-art approaches.
How to Develop a Random Forest Ensemble in Python - MachineLearningMastery.com
The effect is that the predictions, and in turn, prediction errors, made by each tree in the ensemble are more different or less correlated. When the predictions from these less correlated trees are averaged to make a prediction, it often results in better performance than bagged decision trees. Perhaps the most important hyperparameter to tune for the random forest is the number of random features to consider at each split point. Random forests' tuning parameter is the number of randomly selected predictors, k, to choose from at each split, and is commonly referred to as mtry. In the regression context, Breiman (2001) recommends setting mtry to be one-third of the number of predictors.
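As a rough illustration of the mtry heuristic, the sketch below computes the one-third rule for regression and draws the random subset of predictors considered at a single split. The function names are hypothetical, and the square-root rule used for classification is a common convention assumed here, not something stated in the snippet. In scikit-learn, the corresponding argument on the random forest estimators is `max_features`.

```python
import math
import random


def default_mtry(n_predictors, task="regression"):
    """Heuristic number of candidate predictors to consider at each split."""
    if n_predictors < 1:
        raise ValueError("n_predictors must be positive")
    if task == "regression":
        # One-third rule attributed to Breiman (2001) in the text.
        return max(1, n_predictors // 3)
    if task == "classification":
        # Assumed square-root convention; not taken from the text.
        return max(1, math.isqrt(n_predictors))
    raise ValueError("task must be 'regression' or 'classification'")


def candidate_features(n_predictors, mtry, rng):
    """Indices of the predictors considered at one split point."""
    return sorted(rng.sample(range(n_predictors), mtry))


rng = random.Random(0)
mtry = default_mtry(12)  # 12 predictors -> 4 candidates per split
subset = candidate_features(12, mtry, rng)
print(mtry, subset)
```

Because a fresh subset is drawn at every split, different trees end up splitting on different predictors, which is what makes their errors less correlated.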
Feature Selection For Machine Learning in Python - MachineLearningMastery.com
The data features that you use to train your machine learning models have a huge influence on the performance you can achieve. Irrelevant or partially relevant features can negatively impact model performance. In this post you will discover automatic feature selection techniques that you can use to prepare your machine learning data in Python with scikit-learn. Feature selection is a process where you automatically select those features in your data that contribute most to the prediction variable or output in which you are interested.
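One simple way to make "features that contribute most" concrete is a univariate filter. The sketch below ranks each feature by the absolute Pearson correlation with the target and keeps the top k. This is only one possible scoring rule, chosen here as an assumption. The function names and the toy data are hypothetical, and the code uses plain Python rather than the scikit-learn utilities the post refers to.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_k_best(rows, target, k):
    """Return the indices of the k highest-scoring features and all scores."""
    n_features = len(rows[0])
    scores = [abs(pearson([row[j] for row in rows], target))
              for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k]), scores


# Toy data (assumed): feature 0 tracks the target, feature 1 is constant,
# feature 2 is noise.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.8],
]
target = [2.1, 3.9, 6.2, 8.0]
selected, scores = select_k_best(rows, target, k=1)
print(selected)
```

A filter like this scores each feature in isolation, so it can miss features that only matter in combination. Wrapper methods such as recursive feature elimination address that at a higher computational cost.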
Dropout Regularization in Deep Learning Models With Keras - Channel969
A simple and powerful regularization technique for neural networks and deep learning models is dropout. In this post you will discover the dropout regularization technique and how to apply it to your models in Python with Keras. Dropout is a regularization technique for neural network models proposed by Srivastava, et al. in their 2014 paper Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Dropout is a technique where randomly selected neurons are ignored during training.
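The idea of ignoring randomly selected neurons during training can be sketched in a few lines of plain Python. The version below is the "inverted dropout" variant, which rescales the surviving activations by 1 / (1 - rate) so that nothing needs to change at inference time. That rescaling choice, the function name, and the toy activations are assumptions made for illustration. In Keras the same behavior is provided by the `Dropout` layer.

```python
import random


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate`
    during training and rescale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        # At inference time the layer passes activations through unchanged.
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(42)
hidden = [0.5, 1.0, 1.5, 2.0]
dropped = dropout(hidden, rate=0.5, rng=rng)
print(dropped)
```

Each output is either zero or the original activation divided by the keep probability, so the expected value of every unit matches its value without dropout.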
Evaluate the Performance Of Deep Learning Models in Keras
Keras is an easy to use and powerful Python library for deep learning. There are a lot of decisions to make when designing and configuring your deep learning models. Most of these decisions must be resolved empirically through trial and error and evaluating them on real data. As such, it is critically important to have a robust way to evaluate the performance of your neural networks and deep learning models. In this post you will discover a few ways that you can use to evaluate model performance using Keras.
Growing and Pruning Ensembles in Python
Ensemble member selection refers to algorithms that optimize the composition of an ensemble. This may involve growing an ensemble from available models or pruning members from a fully defined ensemble. The goal is often to reduce the model or computational complexity of an ensemble with little or no effect on the performance of an ensemble, and in some cases find a combination of ensemble members that results in better performance than blindly using all contributing models directly. In this tutorial, you will discover how to develop ensemble selection algorithms from scratch. Voting and stacking ensembles typically combine the predictions from a heterogeneous group of model types.
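A greedy pruning pass is one way such a selection algorithm might look. The sketch below assumes each member's predictions on a validation set are already available, combines them by majority vote, and repeatedly drops the member whose removal gives the best accuracy, stopping once any removal would reduce accuracy. The function names, the tie-breaking rule, and the toy predictions are hypothetical and are not taken from the tutorial.

```python
from collections import Counter


def vote(member_preds, members):
    """Majority vote over the chosen members for every sample.
    Ties go to the label seen first among the members."""
    n_samples = len(member_preds[members[0]])
    return [Counter(member_preds[m][i] for m in members).most_common(1)[0][0]
            for i in range(n_samples)]


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def prune_ensemble(member_preds, y_true):
    """Greedily drop members while doing so does not reduce accuracy."""
    members = list(range(len(member_preds)))
    best = accuracy(y_true, vote(member_preds, members))
    while len(members) > 1:
        # Score the ensemble with each member left out in turn.
        scored = [(accuracy(y_true,
                            vote(member_preds,
                                 [m for m in members if m != drop])), drop)
                  for drop in members]
        score, drop = max(scored, key=lambda s: s[0])
        if score < best:
            break
        best = score
        members.remove(drop)
    return members, best


# Toy validation labels and per-member predictions (assumed data).
y_true = [0, 1, 1, 0, 1, 0]
member_preds = [
    [0, 1, 1, 0, 1, 1],  # member 0: one mistake
    [0, 1, 1, 0, 1, 0],  # member 1: no mistakes
    [1, 0, 0, 1, 0, 1],  # member 2: always wrong
]
kept, score = prune_ensemble(member_preds, y_true)
print(kept, score)
```

Scoring on the same data used for selection tends to be optimistic, so in practice the selection set would be held out from both training and final evaluation.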
Detection of Anomalies in a Time Series Data using InfluxDB and Python
Anih, Tochukwu John, Bede, Chika Amadi, Umeokpala, Chima Festus
Analysis of water and environmental data is an important aspect of many intelligent water and environmental system applications where inference from such analysis plays a significant role in decision making. Quite often these data that are collected through sensible sensors can be anomalous due to different reasons such as system breakdowns, malfunctioning sensor detectors, and more. Regardless of their root causes, such data severely affect the results of the subsequent analysis. This paper demonstrates data cleaning and preparation for time-series data and further proposes cost-sensitive machine learning algorithms as a solution to detect anomalous data points in time-series data. Logistic Regression, Random Forest, and Support Vector Machine models have been modified to support cost-sensitive learning, which penalizes misclassified samples and thereby minimizes the total misclassification cost. Our results showed that Random Forest outperformed the rest of the models at predicting the positive class (i.e., anomalies). Applying predictive model improvement techniques like data oversampling seems to provide little or no improvement to the Random Forest model. Interestingly, with recursive feature elimination, we achieved better model performance while reducing the dimensions in the data. Finally, with InfluxDB and Kapacitor the data was ingested and streamed to generate new data points to further evaluate the model performance on unseen data. This will allow for early recognition of undesirable changes in the drinking water quality and will enable the water supply companies to rectify such changes on a timely basis.
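The notion of total misclassification cost can be made concrete with a small sketch. It assumes a binary task where a missed anomaly (false negative) costs more than a false alarm (false positive), and it picks the decision threshold that minimizes the total cost. Threshold selection is a swapped-in illustration, not the model modification the paper describes. The cost values, function names, and toy scores are all hypothetical.

```python
def total_cost(y_true, y_pred, fn_cost=10.0, fp_cost=1.0):
    """Sum of per-sample penalties for false negatives and false positives."""
    cost = 0.0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 0:
            cost += fn_cost  # missed anomaly
        elif t == 0 and p == 1:
            cost += fp_cost  # false alarm
    return cost


def best_threshold(y_true, scores, thresholds, fn_cost=10.0, fp_cost=1.0):
    """Return the candidate threshold with the lowest total cost."""
    def cost_at(threshold):
        preds = [1 if s >= threshold else 0 for s in scores]
        return total_cost(y_true, preds, fn_cost, fp_cost)
    return min(thresholds, key=cost_at)


# Toy labels (1 = anomaly) and model scores (assumed data).
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.4, 0.6, 0.45, 0.9]
candidates = [0.3, 0.5, 0.7]

costly_misses = best_threshold(y_true, scores, candidates)
equal_costs = best_threshold(y_true, scores, candidates,
                             fn_cost=1.0, fp_cost=1.0)
print(costly_misses, equal_costs)
```

With the asymmetric costs the lowest threshold wins because it catches both anomalies at the price of two false alarms, while equal costs favor the highest threshold. Many scikit-learn classifiers expose a `class_weight` argument that applies a comparable weighting during training.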
Autoencoder Feature Extraction for Classification
An autoencoder is a type of neural network that can be used to learn a compressed representation of raw data. It is composed of an encoder sub-model and a decoder sub-model. The encoder compresses the input and the decoder attempts to recreate the input from the compressed version provided by the encoder. After training, the encoder model is saved and the decoder is discarded. The encoder can then be used as a data preparation technique to perform feature extraction on raw data that can be used to train a different machine learning model.